Cognitive OFDM network sensing: 
a free probability approach 
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Abstract — In this paper, a practical power detection scheme 
for OFDM terminals, based on recent free probability tools, is 
proposed. The objective is for the receiving terminal to determine 
the transmission power and the number of the surrounding base 
stations in the network. However, the system dimensions of the 
network model turn energy detection into an under-determined 
problem. The focus of this paper is then twofold: (i) discuss the 
maximum amount of information that an OFDM terminal can 
gather from the surrounding base stations in the network, (ii) 
propose a practical solution for blind cell detection using the free 
deconvolution tool. The efficiency of this solution is measured 
through simulations, which show better performance than the 
classical power detection methods. 



I. Introduction 

The ever increasing demand of high data rate has pushed 
system designers to exploit the wireless channel medium to the 
smallest granularity. In this respect, the orthogonal frequency 
division multiplexing (OFDM) modulation has been chosen as 
the next common standard in most wireless communication 
systems, e.g. WiMax [5], 3GPP-LTE [4]. OFDM converts 
a frequency selective fading channel into a set of flat fad- 
ing channels [20], therefore providing a high flexibility in 
terms of power and rate allocation. Future wireless networks 
therefore tend to be based on highly loaded OFDM cells. 
However, in multiple cell environments, inter-cell interference 
is still the bottleneck factor which considerably reduces the 
network-wide capacity. Cooperation between base stations 
are envisioned to reach the capacity performance of the so- 
called broadcast channel [13], but many problems (essen- 
tially of power allocation and user scheduling) prevent those 
solutions to appear soon in practical standards. Therefore, 
it is essential for mobile terminals to be able to determine 
which neighboring cell provides the best quality of service, 
so that the terminal quickly hands over this best performance 
base station. Classically, only scarce and narrow-band pilot 
sequences allow the terminals to estimate the transmission 
power of the main surrounding base stations, e.g. in 3GPP- 
LTE, two sequences of the 0.7 MHz band are available every 



5 ms. Those synchronization sequences are usually affected by 
fast channel fading and overlap data from other base stations; 
as a consequence numerous occurrences of those pilots need 
be accumulated to achieve a satisfying estimation of the base 
stations transmission power 

The classical alternative to the pilot-aided (also referred 
to as data-aided) power detection is to perform a blind 
estimation from the incoming interfering signals. This raises 
the fundamental cognitive radio question [23], [12], which 
will be an important topic of the present work: "how much 
information can a cognitive receiver recover from the incoming 
signals?". The response to this question answers two clas- 
sical concerns of engineers and system designers: (i) is the 
additional information brought by blind detection worth the 
computational effort?, (ii) is some given blind detector solution 
far from providing all the accessible information?. It is clear 
in particular, from an information theoretic viewpoint, that 
the information received on the N OFDM subcarriers must 
ideally not be filtered in order to provide as much information 
as possible on the problem at hand, i.e. any filtering process 
diminishes the available information in the Shannon's sense 
[1]. Therefore, if as many as L consecutive OFDM symbols 
are received, the available information is contained in the 
received N x L matrix Y, with N typically large. As a 
consequence, since L cannot be taken infinitely large, N/ L is 
non trivial. This leads to the study of large random matrices 
problems, which is currently a hot topic in the wireless 
communication community [8]. This is in sharp contrast with 
classical power detection methods [10], [11] which are only 
asymptotically unbiased, i.e. these methods assume that one 
of the system dimensions is large with respect to the others 
and this condition is necessary to ensure the convergence of 
the underlying algorithms. 

Our purpose is to retrieve relevant information on the base 
station transmission powers. It will be shown hereafter that, 
depending on the a priori knowledge of the receiver, the 
essential part of the power information is, in most practical 
situations, contained in the eigenvalue distribution of the 
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matrix -^YY'^. This naturally leads to the consideration of 
recent research on random matrix theory (RMT) [8] and more 
specifically on free deconvolution [16]. In particular, in [7], 
a similar study of terminal power detection in code division 
multiple access (CDMA) networks is derived from these tools. 
However, the model in [7] only considers flat fading channels 
and dodges the difficulty of multi-path channels; moreover 
the structure of the CDMA encoding matrix allows to easily 
recover the transmitted signal variances, which is not the case 
of multi-cell OFDM in which multiple streams overlap with 
no dedicated code to separate them. 

We propose here first to discuss the optimal amount of 
information which the receiver can extract from the incoming 
data to blindly retrieve the values of the powers transmitted by 
all surrounding base stations, when the receiver's prior state of 
knowledge about the environment is very limited. From this 
analysis, it will be observed that the general problem is not 
very tractable both in terms of mathematical derivation and 
therefore in terms of practical implementation. Secondly, we 
propose a suboptimal but implementable approach to solve 
the cell detection problem, based on the free probability 
framework, for which we derive a novel signal detection 
algorithm for OFDM. 

The remainder of the paper is structured as follows: In 
Section Inl we introduce the model of the multiple cell OFDM 
network. In Section InD we discuss the amount of information 
about transmitted powers which can be collected by the 
terminal from the received matrix Y. The observation that 
all the necessary information is contained in the eigenvalues 
of -^YY'^ leads to Section IIVI in which we evaluate the 
classical energy detection methods, which assume L/N oo. 
In Section |V] we introduce some basic concepts of random 
matrix theory [8], which are needed to the understanding of 
the subsequent sections. In Section IVII we provide a novel 
algorithm to detect the cell transmission powers. Simulation 
results are then provided in Section IVIII A discussion on the 
gains and limitations of this novel method is carried out in 
Section IVIIII Finally, in Section |IX] we draw our conclusions. 

Notations: In the following, boldface lower case symbols 
represent vectors, capital boldface characters denote matrices 
(Ijv is the size-A^ identity matrix). The spaces M(yi, and 
M(yi, z) are the sets of i x j and i x i matrices over the 
algebra A, respectively. The transpose and Hermitian transpose 
operators are denoted (•)^ and (•)'^, respectively. The operator 
diag(x) turns the vector x into a diagonal matrix. The function 



denotes the indicator function. 



II. Downlink Model 



Consider a network of Nb base stations and one terminal 
equipped with a single receiving antenna. This scenario is 
depicted in Figure [T] Assume moreover that the receiver 
is akeady connected to one serving base station, which we 
purposely exclude from the set of the Nb surrounding base 
stations since we akeady know all information about it. The 
network is assumed to be synchronized in time and frequency 
and to use OFDM modulation with a size N discrete Fourier 
transform (DFT). Denote then M the maximum number of 
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Fig. L System Model 



base stations the receiver expects to detect. Ideally Nb < 
M. In the following, for simplification, we assume that this 
condition is always fulfilled and we will only consider the 
parameter M, considering then a network of M base stations, 
of which some could be of null power The link between the 
terminal and the base station k is modelled as a fast-fading 
complex frequency domain channel = [hki . . . hknY ^ 
C^, coupled to a slow-fading path loss Lk = l/Pk with Pk 
the mean received power originating from base station k. The 
terminal also receives additive white Gaussian noise an e 
with entries of variance a^. The base station k sends at time I 
the frequency-domain OFDM symbol s^,'-* = [^il, • • • , s^NkV ■ 
Therefore, the received signal vector y^'^ — [ui, ■ ■ ■ ,2/^^]^ 
at time / reads 



M-l 



.(0 



E 

fe=0 



(0 



an 



(1) 



(1) 



(2) 



with Dfe = diag(hfe). 

This summation over the M cells can be rewritten 

with 0*'^ e C*^^ the concatenated vector 0*^'^ = 
[s'{\ . . . , s'{^]^, H e M(C, N, MN) the concatenated matrix 
of the Dfe, fee {!,..., M} 

hii ■ ■ ■ • • • /iMi • • ■ 



H 

■■■ h2N ■■■ 

and P e M(M, N) the diagonal matrix 
"Pi ••• 







T-MN 



(3) 



Po 













Pm 



(4) 



Now assume that the M channels have a coherence time 
of order (or more than) L times the OFDM symbol duration. 
The L samples y('\ / = 1, . . . , L, can be concatenated into 
nn N X L matrix Y = [y*^-*-) • • • y*-'^-'] to lead to the more 
general matrix expression of the received signal Y 



HP2 + crN 



(5) 



where e M(C, MN, L) and N e M(C, A^, L) are the con- 
catenation matrices of the L vectors 0*^''' and n^'^ respectively. 
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III. Problem statement 

The power detection problem consists in the present situa- 
tion to retrieve the entries of the matrix P from the receive 
matrix Y. The prior information at the receiver is not sufficient 
to find P in a straightforward manner. Indeed, the channel 
matrix H and the transmitted signal are unknown and, 
worse, even their stochastic distribution are usually unknown; 
in particular, recent flexible multiple access OFDM standards 
adapt their transmission rates to the channel quality so that 
the terminal cannot a priori assume to receive either QPSK, 
16-QAM, 64-QAM or any other type of modulation. The a 
priori knowledge / at the terminal is therefore limited to: (i) 
the approximated background covariance E[nn'^] = ct^I, (ii) 
the fast-fading channel power E[h5|!hfe] = 1, (iii) the channel 
delay spread known to be lesser than the cyclic prefix length, 
(iv) the transmitted signal covariance E[05j'0fc] = I^v. 

From this amount of prior information /, the most reli- 
able channel modeQ is obtained from the maximum entropy 
principle [14]. The latter states that the transmitted data 
must be modelled as a Gaussian independent and identically 
distributed process (i.i.d.). As for the short-term channels hfc, 
given a delay spread (counted in integer number of time- 
domain samples), the time-domain representation of h^: must 
be modelled as a Gaussian i.i.d. vector of length Td.; therefore, 
hfc is to be modelled as a Gaussian vector with covariance 
matrix the DFT of 1^^. Since little information about is 
initially known to the receiver, the channels must be modelled 
as the marginal distribution of those Gaussian processes with 
Td varying from 1 to the cyclic prefix length. 

Of course, this model might be very different from reality 
and might provide totally wrong results, as longly discussed 
in [15]. However, this is the best one can blindly infer 
on the transmission scheme from the available information. 
The objective now is to determine what is the probability 
p{P\I) that a sequence of transmitted powers {Pi, . . . , Pm} 
fits the previous model knowing /. From those probabilities, 
computed for all vectors in (M+)*^, an estimate P of P 
can be designed which minimizes some error measure, e.g. 
P = E[P|Y, /] would be the minimum mean square error 
estimate of P. 

The probability p{P\Y,I) assigned to the information 
(P|Y, /) can be written, thanks to Bayes' rule 

p(P|Y,/)=KY|P,/)-^||j^ (6) 

in which (P|/) is the a priori knowledge about P. It is 
classically assigned a uniform distribution over some sub- 
space [0, Pmax]*^ for a maximum receive power Pmax- As for 
p(Y|P, /), it can be expanded as 

p(Y|P, /) = / p(Y|P, H, 0, /)p(H|/)p(0|/)dHd0 

(7) 

in which all integrands are known from the maximum entropy 
model aforementioned, 

'by "most reliable model", we mean the model which satisfies the con- 
straints imposed by I and which is the most noncommittal regarding unknown 
system information. 



• p{&\I) is Standard multivariate i.i.d. Gaussian. 

• the compound channel H is assigned a distribution 

Tmax 

p(H|/) = ^p(H|Trf,/).p(rd|/) (8) 

k=l 

with Td the channel delay spread, Tmax the cyclic prefix 
length, p{Td\I) = 1/Tmax and p(H|rd,/) with standard 
i.i.d. Gaussian diagonals. 

However, the explicit computation of (|7]i is very involved 
and requires advanced tools from random matrix theory. A 
similar calculus was performed by the authors for the simpler 
single-cell MIMO energy detector [31]. In the latter it was 
shown that, surprisingly, the standard i.i.d. Gaussian model 
assigned to the main system parameters makes the energy 
detection depend only on the eigenvalue distribution of the 
receive matrix Y. The multi-cell detection problem at hand is 
very similar in configuration, apart for the channel marginal- 
ization of equation (O which is not i.i.d. Gaussian. Since we 
cannot provide an optimal information theoretical solution to 
our problem and since both aforementioned problems are very 
similar, it seems relevant to concentrate on the close-to-optimal 
random matrix theoretical approach. 

Some important information can nonetheless be already 
deduced from the integral form of equation (|7]). If the trans- 
mission channels are extremely frequency flat, i.e. for all k, 
hki ^ hk2-- - ^ ft-fejv, then {HP^0},y = Y^k^/^khkOkj- 
Therefore, even if the realizations of N and were perfectly 
known, one will have access at best to the variables ^/Pkhk, 
k — 1, . . . , M, from which no reliable estimation of Pk be 
drawn; in such a situation, the posterior probability p(P| Y, /) 
is very broad and is maximized on a large continuous set 
of Pi, ... , Pm- On an information theoretical viewpoint, this 
means that the optimal inference on P given Y and / 
cannot lead to any valuable information. In the random matrix 
approach, the situation is even worse. If one knew perfectly 
the entries of HPH'^, then nothing at all can be said about 
Pi, ... , Pm- Indeed, {HPHH},y = J^k Pk\hk\^ (see Section 
IVTl for details) and the only piece of information which one 
has about Pi, ... , P/\/ is the sum J^k Pk\hk\'^', the latter cannot 
lead to any estimate of P when M > 1 and the problem cannot 
be solved. 

This means that, given the limited prior information of the 
terminal, it is impossible to come up with a reliable estimate 
of P when the channels are frequency flat. In the remainder of 
this paper, we shall therefore consider that the OFDM channels 
are very frequency selectiv^ In the following, we investigate 
the classical power detection techniques, which shall prove 
inefficient in this large non-trivial matrix problem. 

IV. Classical Power Detection 

Usual power detection considers the second order statistics 
of the received signals. In the scalar case, i.e. y''^ reduces to a 
single value y^^\ it was proved [10] that the optimal detector 
with the aforementioned state of knowledge at the terminal 

-note that this assumption ensures high efficiency of the network in terms 
of per-user outage capacity, which is very desirable in the current trend for 
packet-switched communications. 
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consists in evaluating j-{[y^^\ . . . ,y^-'^'>][y^^\ . . . ,y^-^^]^) — 
cr^, with L >> 1. We show in what follows that this classical 
scheme can be simply extended to our network situation but 
that it is very inefficient for small L/N ratios. 

Assuming that L/N is very large, the expression of the 
normalized Gram matrix associated to Y reads 

rH 



E 



y(HP30 

1j 



HP^E 

hph'^ 



crN)(HP5 



HP 2 



-E 



P^H + E 



L 



L 



(9) 



the last line comes from the fact that, N being finite, the NxN 
matrices -^NN*^ and -^0©'^ converge in distribution to an 
identity matrix, and the cross products to null matrices. 

As a consequence, as will be detailed in Section IVI-BI 
one can estimate the values Pk, k E {1,...,A/} from the 
moments {(-^YY^ - cf^In)'"}, fc G {1, . . . , Af}, when L is 
large compared with N. 

Our situation does not fall into this asymptotic L/N 
oo context. In present and future OFDM technologies, the 
number N of available subcarriers is large, e.g. of order a 
thousand subcarriers, while L is limited in our model to the 
channel coherence time or in general to the number of OFDM 
symbols the terminal is willing to memorize before treating 
information. Therefore, even if N and L are large, their 
ratio N/L is not in general close to zero. The fundamental 
asymptotic assumption is therefore no longer satisfied. 

We show in Table H] that the non-trivial ratio N/ L impairs 
significantly the performance of the classical power detection. 
The latter is the result of a simulation in which we applied 
the algorithm that will be described in Section IVI-BI based 
on the sample moments of ■^(YY'^ — ct^In) (instead of the 
sample moments of -^HPH'^, whose corresponding results 
are shown between brackets). The typical situation considered 
in this example is a three-base station scenario of respective 
powers Pi — A, P2 = 2 and P3 = 1 and a noise level — 
0.1, i.e. SNR = 10 dB, N = 256 and L is taken in a range 
from 256 to 32, 768. It turns out indeed that L needs to be 
large for this method to be satisfying. In this precise example, 
this compels L/N to be of order 64, which is not acceptable 
in our current system settings. 

Such problems involving large matrices with non-trivial 
N/L ratios are at the heart of a recent field of research, 
known as random matrix theory (RMT), which is a particular 
case of the more general free probability theory introduced by 
Voiculescu [22]. In the subsequent section, we provide a quick 
introduction to important notions of RMT which are necessary 
to handle the rest of the multiple cell detection study. 

V. RMT AND FREE DECONVOLUTION 

A. Random matrix theory 

Definition 1: A random matrix is a multi-variate ran- 
{Xii, X12, . . . , Xmn} for a given 



N = 256, P = {Pi,P2,P3} = {4,2, 1} 


L 


Estimated P [our algorithm] 


|P - PII2 


256 


{7.93, 1.62, -2.5} [{4.28, 1.35, 1.35}] 


27.63 


512 


{5.82, 2.90, -1.7} [{3.80, 2.16, 1.00}] 


11.42 


1024 


{4.26, 3.60, -0.8} [{3.62, 2.37, 0.94}] 


5.87 


2048 


{4.52, 2.69, -0.2} [{4.22, 1.55, 1.12}] 


1.77 


4096 


{4.20, 2.65, 0.18} [{4.09, 2.06, 0.78}] 


1.41 


8192 


{4.10, 2.28, 0.58} [{4.05, 1.89, 0.92}] 


0.27 


16384 


{3.97, 2.42, 0.89} [{3.95, 2.24, 0.99}] 


0.19 


32768 


{4.07, 1.95, 0.99} [{4.03, 1.95, 0.98}] 


0.01 



TABLE I 

Classical moment-based method 



{M, N) e couple. As such, X is a matrix whose entries 
Xij e C^^^^ are ruled by a joint probability distribution 
p{Xii,Xi2, . . . , Xj^in). 

Free probability is the study of random variables in non- 
commutative algebras, i.e. algebras in which the product 
operation is non-commutative. The algebra of large Hermitian 
random matrices is a particular case of those non-commutative 
algebras. In the following, we shall qualify free any notion 
attached to the free probability (or RMT) framework while 
we shall qualify classical any notion attached to the classical 
probability framework of commutative algebras. 

Similarly to the classical theory of probability, a free ex- 
pectation functional (p can be defined. For a given Hermitian 
random matrix X, the free expectation reads 



i(X) = lim E[trA,X] 



(10) 



and we can similarly define free moments mk,k G N of a 
random matrix. Those are 



rrik = (X^ 



(11) 



Thanks to the trace properties, note that the free moments 
are strongly linked to the eigenvalues Xi, i E {l,N} of X, 
since rrik can also be written 



rrik 



1 ^ 
lim i-FA: 

N^oo N ^ 



(12) 



Indeed, denote X = QaAQa with A = 
diag({Ai, . . . , Aat}) and Qa unitary, X*-' = QaA'^'Qa^. 
Taking the trace of X*^ leads to equation (fTSI) . 

The asymptotic (N ^ L — > 00 with N / L constant) marginal 
distribution of the eigenvalues of X is called the empirical 
distribution of X and will be denoted /ix. Its associated 
cumulative distribution function i^x reads [8] 



i^x(A) 



1 



N 



Imi — > 1 

n^oo N ^ 
i=l 



(13) 



Therefore the free moments are directly linked to the 
empirical distribution of the matrix X, 



rrik = lim / AVx(A)dA 



(14) 



dom variable X 



which is the classical definition of moments associated to the 
distribution /ix. 
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Fig. 2. Marchenko-Pastur law 



Interestingly, for most M.sMa0 random matrices A of large 
dimensions N,L ^ oo with N/ L — c constant, the eigenvalue 
density of X = -^AA'^ converges to a definite distribution. 
For instance, in our current problem, since the input signals 
and noise N are modelled as standard i.i.d. Gaussian, we 
are interested in the so-called Wishart matrices that we define 
hereafter 

Definition 2: An N x N random matrix X = AA^, 
with A a random N x L matrix whose columns are zero 
mean Gaussian vectors with covariance matrix E, is called 
a generalized Wishart matrix of L degrees of freedom. This 
is denoted 

X~WAr(L,I]) (15) 

Wishart matrices Wn{L,In) are known to have an eigen- 
value distribution which converges, when {N, L) grows to in- 
finity with a constant ratio c — N / L, towards the Marchenko- 
Pastur law /x^^ [8]. The Marchenko-Pastur law is defined by 



Mt,, = 1 - - S{x) + (16) 

* c J zncx 

with (a, b) = ((1 - ^/cf, (1 + y/c)^). In Figure|2]we provide 
the distribution of fx^i^ for different values of c. 

Note that when c tends to 0, i.e. when L/N oo, the 
Marchenko-Pastur law converges to a single Dirac in 1 and 
we recover the classical law of large numbers. 

Equivalently to classical probability theory, many results of 
free probability involve the distribution of sum, difference, 
product and inverse of random matrices. The characteristic 
function of a distribution, used to derive the distribution of 
the sum of independent commutative random variables, has 
a free counterpart called the i?-Transform. The Mellin trans- 
form, used to derive the product of independent commutative 
random variables, also has a free counterpart, known as the 

^by usual, we qualify matrices found in common wireless communication 
problems. 



5-Transforrr0. 

Given two large random Hermitian matrices A and B, 
whose sum is C = A + B and whose product is D = AB, 
one can then derive the empirical distributions of C and D 
from the empirical distributions of A and B, which we denote 



Mc = Ma m /iB 
Md = Ma KI /iB 



(17) 
(18) 



Equation (fTTI l is called additive free convolution and equation 
(fTSl l is called multiplicative free convolution. Similarly, given 
only the distributions of B, C and D, one can recover the 
distribution of A 



Ma = Mc B Mb 
Ma = Md S Mb 



(19) 
(20) 



in which equation ( |T9b is called additive free deconvolution 
and equation ( l20b is called multiplicative free deconvolution. 

B. Free deconvolution for information plus noise model 

Our interest is to treat a particular communication model, 
known as the information plus noise model. 

Definition 3: Given two N x L {N/L — > c) large random 
matrices R (standing for an informative signal) and X (stand- 
ing for a noise additive signal) and a scalar a (the standard 
deviation of the noise process), the model given by 



W = -(R + o-X)(R + o-X)^^ 
L 



(21) 



is called the information plus noise model. We shall therefore 
call W an information plus noise matrix. 

It has been recently shown [7] that the empirical distribution 
of -^RR^ in the previous definition can be recovered from the 
empirical distributions of the matrices W and XX*^ when X 
is Gaussian with i.i.d. entries of zero mean and variance 1/L. 
This is given by 



Mf RRH = ((MW S M.)c) B ^ Mr,, 



(22) 



For more details about the demonstration of formula ( |22] |. refer 
to [16]. 

Thanks to this RMT framework and the free convolution 
tools, we can now address our multiple cell detection problem. 

VI. Application of free deconvolution to multiple 

CELL DETECTION 

A. Signal and noise deconvolution 

In the model Q, the iV x matrix -^YY^ is an in- 
formation plus noise matrix with N a Gaussian random 
matrix. Therefore, according to equation (l22b . when N, L 
are sufficiently large, one can derive the empirical distribution 

''note that independence in the classical probability sense is not enough in 
free probability to derive the empirical distribution of the sum, difference, 
product and inverse product of two random matrices. This independence 
notion is extended to the asymptotic freeness concept [8] which is not a 
technical issue in our study. 
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fotl( 



-J-HP5®© 



Hp^jjH from the empirical distribution /ij_YYH 



(out) 



OWS 



'^iHpi©©"P3HH = ((^^YY" S MrjJ B f^^^) ^ M»?e (23) 

where c — N/L, for N is of size N x L. 

Also, the matrix in equation (|5]l is modelled as standard 
i.i.d. Gaussian. Therefore ■^P^H'^HPs 00^ is a generalized 
Wishart matrix with covariance PsH'^HP^. Analogously to 
( fTSl l, this can be written 



lp5H^HP^00'^ - WAr(L,P5H^HP^) 



(24) 



Then, the empirical distribution of the covariance matrix 
P^H^^HP^ of the Wishart matrix iP^H^^HP700'^ 
can be recovered from the empirical distribution 
u , 1 ^ 1 y when the couple (N, L) tends to 

infinity with a constant ratio c' = MN/L {M is constant). 
Consequently, similarly to 



Ml- 



1 u S «„ , 



(25) 



P'lHHHPS ^-ip3HHHP7©©H 

The left side of equation ( |23] l is slightly different from the 
desired expression in the right side of equation dZST l. Still, 
thanks to the trace commutativity property, we have the link 
[8] 



1 



i-ptH"HP^©©" TVf ^iHpi©©np^H" 



(26) 

This relation is due to the fact that the positive eigen- 
values of -i;HP500^^P^H" are the same as those of 
-^P2H'^HP200 (since their traces are identical and then 
all their moments match). But the rank of both matrices 
differ and then a certain amount of null eigenvalues must 
be introduced. Here, \YiP^@@^V^2li^ is of full rank {N) 
while -iP2 H^HPa 00*^ is only of rank N for a matrix size 
MN , hence the M factor in equation ( l26b . 

Finally, we similarly connect the left side of equation 
to /iHPH" through 



A* 



P'JHHHPS 



1 

^Mhphh 



(27) 



The empirical distribution of HPH'^ was then derived from 
the empirical distribution of -^YY^. As a consequence, the 
free moments dk = E[tr7v(HPH'^)'"'] can be retrieved from 
the free moments ruk = E[trjv(-^YY'^)'^]. Surprisingly, it 
is shown in [16] that for all aforementioned free convolution 
operations, the set of the first k moments of the [de]convolved 
distributions can be recovered from the set of the k first 
moments of the operands. This substantially reduces the com- 
putational effort, as is described in the following. 

Let us work in detail all the steps to derive the moments dk 
from the moments mk- 

1) first, the noise contribution to the signal Y is decon- 
volved thanks to formula ( |23] ). The multiplicative convo- 
lution (resp. deconvolution) /i(out) of a distribution /i(;p) 
and the Marchenko-Pastur law /j,^^ can be computed 
from all the moments mt"^ . It is shown in [7] that 



can be computed from the moments/cumulants 



transform (resp. cumulants/moments transform) [8] of 
the coefficients c • with c = N/L. As for additive 

convolution (resp. deconvolution) /i(add) of two distribu- 
tions /i(a) and //(b), the free cumulants of /i(add) are the 
sum (resp. difference) of the cumulants of /i(a) and 
From ( |23] ), the moments to',, of u , i u i are 

^ ' ^iHP20©"P2HH 

obtained from the moments to^ of /ij_YY"; therefore, 
in mathematical terms, this reads 

(to'i, . . .,m'nj) = §i(mi, . . ..mM.cr^) (28) 



with 



Si = -M 

c 



e<! -e(cTOi,, 



, cniM, 



2M 



))] 



(29) 



where the functions M(-) and C(-) stand respectively 
for the cumulants/moments transform and the mo- 
ments/cumulants transform that both take as argument 
a vector of size k (the first k moments or cumulants, 
respectively) and output a size k vector (the first k 
cumulants or moments, respectively). The inner ■^C(c-) 
operation multiplicatively deconvolve the Marchenko- 
Pastur law /i^^. Then the next C application turns the 
resulting moments into cumulants. The additive decon- 
volution of 6^2 is then performed through the cumulant 
difference and the output is turned back into the moment 
space through the M application. The outer ■iM(c-) is 
finally performed to multiplicatively convolve the result 
with iijf^. 

2) the moments to'/ of u , i ^ i „ are then given 

/ k 1 P'lHHHPS©©" * 

through equation ( [26] l by a simple scaling of the mo- 



ments to'^ by 1/M. This reads 

1 



K', 



I'M) 



(to'i , . 



I'M) 



(30) 



3) the Marchenko-Pastur law /i,, , , c' = MN/L, is then 
deconvolved from /i-^P 2 H'^HP 2 00'^ according to 
equation d25T l. This leads then to the moments m'/f of 



P2H"HP2 ' 



, TO 



M 



) = -e(c'd'i,...,c'd'M) 



(31) 



4) finally, the resulting moments m'/' are scaled by M to 
obtain the dk coefficients, 

(di,...,dM) = AfK",...,TOX;) (32) 
Figure [3] summarizes these steps. 

Our final interest though is to find the diagonal values of 
P. The distribution of the channel matrix H was modelled in 
Section HI] by a mixture of correlated Gaussian subchannels. 
It is difficult, at this point of knowledge in free probability 
theory, to deconvolve the effect of H from the random matrix 
HPH'^. Only classical methods can help in this situation. 
Remarkably, it turns out that the matrix HPH'^ is diagonal 



HPH^ 







kN\ 



(33) 
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For fc e {1, . . . , Af}, computation of 


m^. = E 


tri(YY^)'^- 





r 



Deconvolution of the additive noise 



Retrieval of the dk's 



Fig. 3. m;. to cij. Block Diagram 

Therefore the theoretical moments dk of /^hph" the 
normalized traces of the asymptotic diagonal matrices of 
entries, 

k 



{(HPH^)^}^ 




(34) 



and then the p*'' order moment dp = E[tr7v(HPH'^)P] of 
HPH'^ can then be equated for large N to 



(35) 




In the following we provide a method to estimate the entries 
Pk under the assumption, which is never verified in practice, 
that the channels are extremely frequency selective, i.e. such 
that, for any k and any couple j ^ j' , the entries /i^ j and 
/ifcj' are independent. 

B. Estimation of the powers Pk 

As already concluded in Section |III] if the channel delay 
spreads are very short, then the channel frequency responses 
{hki, . . . , /ifcw}, A; G {1, . . . , A/}, are strongly correlated and 
almost nothing can be deduced on the entries of P. On the 
contrary, assume that the channel delay spread is very large, 
and rewrite equation ( l35T l as 



dp — dp 



3=1 \k=l J 



(36) 
(37) 



with dp the p*'' sample order moment {N < oo) and Wp some 
noise process. The latter converges to when N ^ oo and the 
channel frequency response is i.i.d. Gaussian. However, when 
N is finite or when the channel is less frequency selective, 
then Wp is a bit more difficult to handle. 

To push the computation forward, we need first to derive 
the classical order p moment of the variables for 
any couple {k,j), in the complex case. This gives 



(2^)!(2b-^])! 

i\{p-iy. 



(38) 



One can then derive the general expression for dp as 



A (2fc)!(2[fc, -fc])! 
(fc!)2([fc, -fcl!)2 

ki,...,kM i=i Kk=0 ^ ^ ' ' ' 



pi 

E n 



(39) 



The complete derivation of formula ( [39] ) is provided in the 
appendix. 

7 ) Bayesian approach: Let K be some integer greater than 
or equal to M. The noise process w = [wi, . . . ^wkV ^ 
already mentioned, is in general difficult to analyze. We shall 
consider in the following that one actually has a limited 
knowledge about w which reduces to the covariance matrix 
C — E[ww^] gathered from previous simulation^. Now, 
consider that the set of prior information / contains the 
following elements: 

• the values of Pi , ... , Pm are real positive and known not 
to be larger than some value Pmax- 

• the typical error variance C in the observed moments 
{dk}, fc e {1, . . . , M}, is known. 

In fact, the exact covariance matrix C cannot be known 
since its computation requires the exact knowledge of P. 
Indeed, a few lines of calculus of 



E 



(d-d)^(d-d) 



(40) 



with d = [c?i, . . . , dxY ^ lead to equation (l4Tl i. which depends 
on Pi, . . . ,Pm- 

However, let us first consider C known before we introduce 
alternative solutions when C is unknown. 

The objective is to infer on the set {Pi, . . . , Pa/} given / 
and the observed sample moments d. An error measure must 
be considered to come up with an estimate of {Pi, . . . , Pm}- 
We consider here the estimate of P which minimizes the mean 
quadratic error (MMSE). This MMSE estimate P is given by 



P = E [P|d] 

Pp(P|d,/)dP 

p(d|P,/)p(P|/) 
P /pp(d|P,/)p(P|/)dP 



dP 



(42) 
(43) 

(44) 



Since the prior information (P|/) is limited to the fact that 
all entries are upper-bounded by Pmax, p{P\I) should be set 
uniform on the space [0, Pmax]*^ according to the maximum 
entropy principle. However, note that if {Pi, . . . ,Pm} mini- 
mizes the MMSE then also does any permutation of this set. 
Therefore, to have a correctly defined problem (with a unique 
solution), the set {Pi, . . . , P/\/} must be ordered; we will then 
state in the following that Pi < P2 < Pm- Therefore the prior 
p{Pk\I) is taken uniformly on the set [0, Pfe_i] when P^-i is 
set, which leads to p(P|/) = ^max Ilfli ^ -^i^^- Also, since 
only the error covariance matrix C in the observed sample 
moments d is known, the maximum entropy principle requests 
that the process w is assigned a Gaussian distribution with 

^honesty would require that we actually derive the maximum entropy 
distribution of w but this would lead to involved computation. 
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variance C. Therefore, equation (l44l) becomes 



Pl<...<P^_ 



<-<Pa: 



j-w(P)"C-iw(P)(^p 



dP (45) 



by turning this system of equations into the equivalent system 



M 



^Pfc-Qi(di) 
fe=i 

M 

^P| = Q2(dl,d2) 



fc=l 



M 



where we denoted w = w(P) to remind the actual dependence 
of w in the powers Pi, ... , Pm (through the expression of d 
in equation ([39]l). 

Unfortunately, the integration space of equation ( |45] ) makes 
both integrals rather involved to compute. A way to practically 
computed P consists in turning the integrals into finite sums 
over thin sliced versions of the integration space. Also, as 
previously mentioned, the covariance matrix C is obviously 
unknown when trying to decipher the cell powers Pi, ... , Pm. 
However, iterative methods can be considered in which C is 
initially defined as the covariance matrix Cinit of an hypothetic 
set of powers, say Fi = F2 = ■ • ■ = Pm = -Pmax/2. 
Then, running the MMSE estimator with Cj^jt returns a first 
set P^*-^-* , . . . , PjIP from which a refined version of C can 
be evaluated (from formula (|4T])). Note that, in running k 
instances of the algorithm, the sample moments dp can be 
accumulated and the covariance matrix C has to be computed 
as if as many as kN subcarriers were actually used in the free 
deconvolution algorithm. This process can be processed in a 
loop for a satisfying number of iterations. 

2) Alternative estimators: Other estimators than MMSE, 
such as maximum-likelihood (ML), might be considered 
which take as an estimate the set P which minimizes 
w(P)'^C~^w(P). However, the measure associated to the 
ML estimator does not suit the broad a posteriori distribution 
p(P|d,/) as will be shown in simulations. Indeed, a large 
estimation error is as bad as a small estimation error in the ML 
context; therefore, when the posterior ]3(P|d, /) is not peaky, 
large estimation errors are expected. The MMSE estimator is, 
in this scenario, more appropriate. 

A zero-forcing method can also be derived. From equation 
( [39] l, if nothing were known about the noise process Wp, one 
might naively consider solving the system of M equations 
( [39] l, with p = 1, .... M in the M unknowns Pi, ... , Pm, in 
which dp is set equal to the observed dp. This can be solved 



Y,Pk' = QMidl,...,dM) 



(46) 



fe=i 



with Qk e R[di,...,dk]. 

This system can be solved using the Newton-Girard for- 
mulas [18]; the solutions Pi, ... , Pm are found to be the M 
roots of an AP^ degree polynomial. This solution does not 
require any knowledge on the covariance matrix C, which does 
not have any signification in this context. However, it often 
turns out that the roots Pk are not all real, which makes the 
solution useless in practice. Also, this zero-forcing method is 
very awkward as it strongly suffers from the presence of noise. 
Especially, the large variance E[|wmP] is considered equally 
to the typically very small variance E[|wip], and therefore 
the first sample moment di is not more important in the final 
computation than the last sample moment JaJI- 

VII. Simulation and Results 

In the following, we use the results that were previously 
derived in the case of a three-cell network that the terminal 
wishes to track. The set of cells studied along this part are of 
relative powers Pi = 4, P2 = 2, P3 = 1. 

Before performing the first simulation of the complete 
algorithm, we present in Figure H] the relative error in the 
estimation of the moments Uk = trAr(HPH'^)*'' from the 
free deconvolved sample moments dk. It is observed that 
even for N = 256, L = 512 the mean relative error in the 
computed third moments is of order 1%. This suggests that 
the free deconvolution technique is very accurate even for 
non-infinite values of N and L. The bottleneck approximation 
in the estimation of Pi, ... , Pm, as will be observed in the 
coming plots, therefore lies in the convergence of the entries 
hkj towards a Gaussian i.i.d. process, and not in the infinite 
matrix size assumption. 

In a first simulation, we study the convergence properties of 
the proposed scheme. We consider a large OFDM system with 

^this would be here again a dishonest consideration as one at least knows 
that djv/ is more uncertain than di. 
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Relative error on recovered moments U]^ = trjv(HPH^)* from 
deconvolution algorithm, L = 2N 
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Fig. 7. Cell power detection, = 512, L = 1024, 10 accumulations, 
MMSE estimate. Perfect knowledge of C 
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Fig. 8. Cell power detection CDF, N = 512, L = 1024, Perfect knowledge 
of C 



Fig. 5. Cell power detection. 
Perfect knowledge of C 



TV 



2048, L = 4096, MMSE estimate. 





Fig. 6. Cell power detection, N = 512, L = 1024, MMSE estimate (left) 
and ML estimate (right), Perfect knowledge of C 



N = 2048 subcarriers and L = 4096 sampling periods under 
ideal Gaussian i.i.d. input symbols, uncorrelated Gaussian 
channel frequency responses and Gaussian additive white 
noise with signal to noise ratio SNR = 20 dB. The covariance 
matrix C is the exact covariance matrix. Figure |5] provides the 
results of this simulation for thousand channel realisations. It 
is observed that the distribution of the eigenvalues is largely 
spread over the expected eigenvalues. This is explained by 
the slow convergence nature of the computed dp towards the 
corresponding moments when N ^ oo. Also, it is observed 
that the peak centers are offset from the expected powers. This 
is mainly due to the fact that the MMSE estimator is meant to 
minimize the mean quadratic error averaged over all possible 
sets Pi , P2 , P3 and not for the particular set selected her^ 
Also, the actually non-Gaussian property of the noise process 

'it was observed in particular from other simulations that cells of equal 
powers are generally better estimated. 
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w as well as the inexact results from the free deconvolution 
process contribute to the offset. 

Figure |6] provides a comparison between the results given 
by the MMSE estimator and the ML estimator when N = 
512 and L = 1024. It turns out, as discussed earlier, that the 
ML estimate is more largely spread around the expected cell 
powers than the MMSE estimate. 

In the following, we perform more realistic simulations in 
which input signals are QPSK modulated instead of complex 
Gaussian, and with more realistic channels of length varying 
from 1 to N/4 symbols. To reduce the variance of the 
estimates, we also average the sample dp over ten channel 
realizations, which in practice requires around 1 ms of data 
to process. In Figure |7] we took N = 512, L = 1024 and a 
Rayleigh channel of length iV/8. The covariance matrix C is 
still the exact covariance matrix. Then a hundred realisations 
of this process are run. The SNR is still SNR = 20 dB in 
this second experiment. The cumulative distribution function 
(CDF) of the detected powers is presented in Figure [8] and 
compared to the CDF when no accumulation is performed. 
The three thresholds corresponding to the three detected cell 
powers can be observed, with a slight shift from the expected 
cell powers caused both by the non-exact Gaussian assumption 
on w and on the non-exact Gaussian i.i.d. assumption on the 
channel frequency responses. 

Now we propose to examine the performance of the iterative 
cell power recovery. We initially set Ci^jt to the covariance 
matrix of a set of cells of powers Pi = P2 = P3 = 2.5. 
Then ten iterations of the cell-power detector are run, with 
at each step a refinement of C. Note that, since at each 
step the sample moments dp are accumulated, the entries 
of the covariance matrix C are computed accordingly, i.e. k 
accumulations demand that C is computed from an effective 
number kN subcarriers, as presented in Section [Vl] Figure |9] 
provides the results of the recursive algorithm, which proves 
to performs very accurately, with surprisingly little dispersion 
around the detected powers compared to Figure |7] 



Channel Type 


RMS delay spread 


Channel length 


EVA 


357 ns 


N/27 


ETU 


991 ns 


N/13 


TABLE II 



3GPP-LTE STANDARDIZED SHORT DELAY CHANNELS 



Also, we test the robustness of our algorithm against prac- 
tical short channels, instead of high delay spread channels. 
This is shown in Figure [TO] which proposes a comparison 
between the ideal long channel situation and the 3GPP-LTE 
[4] standardized Extended Vehicular A (EVA) and Extended 
Typical Urban (ETU) channels with parameters given in Table 

m 

Here we considered a mobile handset with Nr — 2 antennas, 
256 subcarriers per antenna (and then in total an effective 
number of = 512 subcaiTiers), L = 1024, SNR = 20 dB. 
The covariance matrix C of the noise process w is an 
approximated matrix obtained from intensive simulations on 
short channels. Indeed, the covariance pattern is very different 
from the matrices used in the Gaussian i.i.d. scenario and 
is difficult to derive analytically; especially we noticed that 
the smaller the delay spread, the larger the uncertainty on the 
higher moments, which turns C into an ill-conditioned matrix. 
The results are averaged over ten channel realizations. The 
CDF of the detected power distribution for those channels 
is provided in Figure [TO] The latter shows a rather good 
behaviour in both ETU and EVA channels. Nonetheless their 
short delay spreads lead to a larger variance in the mean power 
estimation. 

Surprisingly, it turns out that the distribution of the input 
signals s'^'^ does not impact the system performance as long 
as it is i.i.d. with zero mean and unit variance. This is a 
known result in free deconvolution which has not been proven 
yet. Therefore in our simulations, QPSK modulations showed 
the exact same behaviour as purely Gaussian distributed input 
signals. 
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VIII. Discussion 



A. Data, prior and convergence 

We previously described and simulated a recursive algo- 
rithm meant to converge to an accurate estimate of the cell 
powers Pi, ... , Pm, whose convergence we did not prove yet. 
Actually, a mathematical as well as a philosophical reasons for 
this convergence can be advanced. First, note that a large num- 
ber of iterations lead to a smaller variance of w = d — d(P), 
therefore to smaller entries of the noise covariance matrix C. 
As a consequence C^^ has large entries and the exponential 
terms e-5(d-d(P)rc-i(d-d(P)) MMSE integrals (|45]l 

are relevant only when the differences d— d(P) are very small; 
this is, when d(P) ~ d. Therefore the accuracy in the entries 
of C^^ is not of fundamental importance, as long as they 
are large enough. The convergence of the iterative process is 
ensured by the accumulated data d themselves; but of course 
this convergence is accelerated with good approximations of 
C. 

This observation is very general in the Bayesian probability 
theory context and has been observed and analyzed thoroughly 
by Jaynes [15]. Bayesian probabilities rely on a balance 
between priors and data. If the available data is scarce, then 
prior information is very valuable; in our situation, as shown 
by simulation, if C were known, then even a single channel 
realization allows to approximately recover the transmitted 
powers. On the contrary, large amounts of data prevail over 
prior information so that even unfortunate priors may not badly 
alter the a posteriori probability; this explains why a precise 
estimation of C is not mandatory when large accumulations 
of data are considered. Note that in this case, ML estimates 
and MMSE estimates show similar performance, since the 
posterior distribution p{P\Y,I) is very peaky in the correct 
value for P. 

However, it is important to underline the fact that the 
offset problem, observed in simulation, in the estimation of 
Pi, ... , Pm will not be reduced by mere data accumulations. 
Indeed, the issue comes here from the free deconvolution 
process which is not accurate for finite N. Therefore, an appro- 
priate trade-off between large N and many accumulations must 
be found: large N entail more accurate free deconvolution 
processes at the expense of computationally demanding large 
matrix products, while many accumulations ensures a faster 
convergence (to offset cell powers) at a lower computation 
cost. Another approach consists in computing higher order 
moments to strengthen the cell power estimation. However, 
high order sample moments have a large variance for finite N. 
Their impact on the final estimation might then be very limited 
since they are very unreliable. For instance we observed in 
simulations that, for N = 512, typical matrices C verify 
Cii ~ le~^ and C33 ~ le^, which gives a million times 
more credit to the first order sample moment than to the third 
order sample moment. Bringing in the computation fourth 
order moments with = 512 would turn out not worth the 
computation increase. 



B. Applicability 

As reminded in the introduction of this paper, usual cell 
power detection techniques use scarce and largely interfered 
synchronization sequences. Much time, but low computation, 
is then required to detect cells with high efficiency. From 
the authors' experience in the context of 3GPP-LTE, many 
problems arise when those pilot sequences are synchronized 
and the emitting cells have equal powers since both signals 
interfere to the detriment of the decoder. In the previously 
derived scheme, even cells of equal power can be counted 
and isolated. Indeed, if M cells are expected and the CDF of 
the estimated powers shows a large jump of 2/AI for a given 
power P, then we can deduce that two cells have equal power 
P. It is therefore not necessary to enhance the quality of the 
estimation to separate those two cells. 

However, some limitations can be found in the applicability 
of this cognitive scheme. Firstly, loaded cells are required 
to ensure reliability of the estimates; this requires that the 
underlying standards optimally reuse the allocated bandwidth. 
Secondly, if many cells are desired to be detected, then very 
high order moments must be computed which, as discussed 
earlier are only reliable if the number of subcarriers N and 
the number of accumulations are very large. High order 
moments also require very large matrix products, which might 
be too demanding to the embedded system processors. Also, 
under the limited state of knowledge of the system model, 
neighboring cells can only be detected if the propagation 
channels are very frequency selective. The flatter the channels, 
the more numerous the accumulations required to come up 
with a reliable estimation. The latter limitation is obviously 
the most constraining factor. 



IX. Conclusion 

In this paper, we provided a practical way to blindly detect 
neighboring cells in a distributed OFDM network. Assuming 
constant transmission in those cells on a fairly large band- 
width, we showed that one can determine the individual SNR 
of every surrounding cell provided that the channel delay 
spread is sufficiently large. This scheme is particularly suited 
for the future cognitive OFDM systems which aim to reduce 
the amount of synchronization sequences while keeping track 
of the neighboring cells. 
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Appendix 

Consider Gaussian channels with independent frequency 
responses hij (i E {1, . . . , M}, j £ [1, N]). Then, for a given 
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j, noise taken apart, we denote 



/ M \P 

M 



kx,k2,...,kM 

J2m km=P 



1=1 



(47) 



where the product of the binomial coefficients 



fei!fc2!...fcM!- 



is the muhinomial coefficient 



Hence the simpHfied expression, when averaging dp over 
J e [1, N] (with TV to ensure ^ ^ [|/^^Jf ]) 



A;i,fc2,...,fcM 



(48) 



The moments of the complex variable j/iy p are deduced 
from the moments of a Gaussian real variable by considering 

(49) 

where the real and imaginary parts /i^ and lif^ of hij are 
standard Gaussian variables whose even moments are known 

{2p)\ 



(50) 



hence equation 
This results in 



ki 



ki,...,kM i=l U=0 » J ; 



(51) 
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